Perfectly load-balanced, optimal, stable, parallel merge

Authors

  • Christian Siebert
  • Jesper Larsson Träff
Abstract

We present a simple, work-optimal and synchronization-free solution to the problem of stably merging in parallel two given, ordered arrays of m and n elements into an ordered array of m + n elements. The main contribution is a new, simple, fast and direct algorithm that determines, for any prefix of the stably merged output sequence, the exact prefixes of each of the two input sequences needed to produce this output prefix. More precisely, for any given index (rank) in the resulting, but not yet constructed, output array representing an output prefix, the algorithm computes the indices (co-ranks) in each of the two input arrays representing the required input prefixes, without having to merge the input arrays. The co-ranking algorithm takes O(log min(m, n)) time steps. The algorithm is used to devise a perfectly load-balanced, stable, parallel merge algorithm in which each of p processing elements has exactly the same number of input elements to merge. Compared to other approaches to the parallel merge problem, our algorithm is considerably simpler and can be up to a factor of two faster. Compared to previous algorithms for solving the co-ranking problem, the algorithm given here is direct and maintains stability in the presence of repeated elements at no extra space or time cost. When the number of processing elements p does not exceed (m + n)/log min(m, n), the parallel merge algorithm has optimal speedup. It is easy to implement on both shared and distributed memory parallel systems.
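
The co-ranking idea described in the abstract can be illustrated with a short Python sketch. It is a binary-search-style co-rank written from the abstract's description, not necessarily the authors' exact formulation; the names co_rank, merge_block and parallel_merge are hypothetical, and ties are assumed to be resolved in favour of the first array to keep the merge stable. The thread pool only demonstrates that the blocks are independent; in CPython it is not expected to yield a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def co_rank(i, a, b):
    """Return (j, k) with j + k == i such that a[:j] and b[:k] are exactly
    the inputs that form the first i elements of the stable merge of a and b
    (on ties, elements of a come before elements of b)."""
    m, n = len(a), len(b)
    j = min(i, m)             # start by taking as much as possible from a
    k = i - j
    j_low = max(0, i - n)     # lower bounds for the search on j and k
    k_low = max(0, i - m)
    while True:
        if j > 0 and k < n and a[j - 1] > b[k]:
            # Too many elements taken from a: shift half the range to b.
            delta = (j - j_low + 1) // 2
            k_low = k
            j -= delta
            k += delta
        elif k > 0 and j < m and b[k - 1] >= a[j]:
            # Too many elements taken from b: shift half the range to a.
            delta = (k - k_low + 1) // 2
            j_low = j
            k -= delta
            j += delta
        else:
            return j, k


def merge_block(a, b, lo, hi):
    """Sequentially produce output positions lo..hi-1 of the stable merge."""
    j, k = co_rank(lo, a, b)
    j_end, k_end = co_rank(hi, a, b)
    out = []
    while j < j_end and k < k_end:
        if a[j] <= b[k]:      # '<=' keeps elements of a first on ties
            out.append(a[j])
            j += 1
        else:
            out.append(b[k])
            k += 1
    out.extend(a[j:j_end])    # at most one of these two slices is non-empty
    out.extend(b[k:k_end])
    return out


def parallel_merge(a, b, p):
    """Split the output into p blocks whose sizes differ by at most one
    and merge each block independently of the others."""
    total = len(a) + len(b)
    bounds = [r * total // p for r in range(p + 1)]
    with ThreadPoolExecutor(max_workers=p) as pool:
        blocks = pool.map(lambda r: merge_block(a, b, bounds[r], bounds[r + 1]),
                          range(p))
        return [x for block in blocks for x in block]
```

For example, with a = [1, 3, 3, 5, 7] and b = [2, 3, 3, 6], co_rank(4, a, b) returns (3, 1): the first four output elements are 1, 2, 3, 3, where both 3s come from a because equal elements of the first array precede those of the second. Each block needs only its two boundary co-ranks, so no communication between blocks is required once the inputs are visible to every worker.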

Similar resources

External Sorting for Databases in Distributed Heterogeneous Systems

A common approach to external parallel sorting in parallel database query processing is to split the data of initial runs into partitions. These partitions are assigned statically to the processes of the merge phase to produce a globally sorted result. This strategy may lead to low performance if some processes are overloaded because of data skew or load imbalances. In this paper we describe a n...

Fast Sorting on a Distributed-Memory Architecture

We consider the often-studied problem of sorting, for a parallel computer. Given an input array distributed evenly over p processors, the task is to compute the sorted output array, also distributed over the p processors. Many existing algorithms take the approach of approximately load-balancing the output, leaving each processor with Θ( p ) elements. However, in many cases, approximate load-ba...

Implementing the 3D Alternating Direction Method on the Hypercube

The paper considers computational domains structured as a 3D grid of cells. It presents a cell-to-hypercube map that is useful for implementing the Alternating Direction Method (ADM). The map is shown to be perfectly load-balanced, and to optimally preserve adjacencies between cells in the computational domain.

Percentile Finding Algorithm for Multiple Sorted Runs

External sorting is frequently used by relational database systems for building indexes on tables, ordered retrieval, duplicate elimination, joins, subqueries, grouping, and aggregation; it would be quite beneficial to parallelize this function. Previous parallel external sorting algorithms found in the database literature used a sequential merge as the final stage of the parallel sort. This re...

The load distribution problem in a processor ring

Given a global picture of the system load and the average load, the load distribution problem is to find a suitable schedule, consisting of the amount of excess load to transfer along every edge, so that the system load can be balanced in minimal time by executing the schedule. We study this problem for the ring topology. We discuss some existing algorithms, show how they fall short of being ab...

Journal:
  • CoRR

Volume abs/1303.4312

Publication date: 2013